# Variables to put into model

Response Variables

* Effect (Y/N) - Binary Data
* Effect (Y/N) - Binary Data x Effect Score (binned organism level effects)
# Continuous Variable Distributions
There are several continuous variables that we want to feed into our model. Before doing so, we’re going to check the distribution to see if any of them are skewed and need to be transformed.
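A minimal sketch of such a check, using simulated data because the real data frame is not shown here. The column name `dose_mg_L` and the helper `skewness()` are hypothetical; the idea is to compare skewness and histograms before and after a log10 transform. Note that log10 only works for strictly positive values.

```r
# Simulated stand-in for one right-skewed continuous variable.
# `dose_mg_L` is a hypothetical name, not a column from the real data.
set.seed(1)
dat <- data.frame(dose_mg_L = rlnorm(500, meanlog = 0, sdlog = 2))

# Simple moment-based skewness (close to 0 for a symmetric distribution).
skewness <- function(x) {
  x <- x[!is.na(x)]
  mean((x - mean(x))^3) / sd(x)^3
}

# log10 transform (requires strictly positive values).
dat$log_dose_mg_L <- log10(dat$dose_mg_L)

skew_raw <- skewness(dat$dose_mg_L)
skew_log <- skewness(dat$log_dose_mg_L)

# Side-by-side histograms of the raw and transformed values.
op <- par(mfrow = c(1, 2))
hist(dat$dose_mg_L, main = "Raw", xlab = "dose (mg/L)")
hist(dat$log_dose_mg_L, main = "log10", xlab = "log10 dose (mg/L)")
par(op)
```

If the transformed skewness is much closer to zero than the raw value, the transform is doing its job for that variable.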
Due to skewed data, the following categories are log10 transformed before modeling:
All Independent Variables, Response Variable: Effect (Y/N)
Full Model
## $nlevels
## shape_f org_f life_f bio_f acute.chronic_f
## 3 3 3 5 2
##
## $levels
## $levels$shape_f
## [1] "Fiber" "Fragment" "Sphere"
##
## $levels$org_f
## [1] "Crustacea" "Fish" "Mollusca"
##
## $levels$life_f
## [1] "Adult" "Early" "Juvenile"
##
## $levels$bio_f
## [1] "Cell" "Organism" "Population" "Subcell" "Tissue"
##
## $levels$acute.chronic_f
## [1] "Acute" "Chronic"
##
## Call:
## lm(formula = effect_10 ~ ., data = aoc_setup_select_1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.6570 -0.3312 -0.1912 0.4720 1.0980
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.00445 0.14865 6.757 1.64e-11
## log.dose.particles.mL.master -0.11159 0.12526 -0.891 0.373059
## log.dose.mg.L.master 0.04305 0.00792 5.435 5.84e-08
## log.dose.um3.mL.master 0.09858 0.12554 0.785 0.432364
## log.size.length.um.used.for.conversions -0.35564 0.37626 -0.945 0.344615
## shape_fFragment -0.25198 0.04644 -5.426 6.15e-08
## shape_fSphere -0.26325 0.10149 -2.594 0.009533
## org_fFish -0.07452 0.02352 -3.169 0.001543
## org_fMollusca -0.16489 0.02650 -6.223 5.47e-10
## life_fEarly -0.06019 0.02189 -2.750 0.005997
## life_fJuvenile -0.11565 0.02356 -4.909 9.57e-07
## bio_fOrganism -0.33858 0.04773 -7.094 1.57e-12
## bio_fPopulation -0.61578 0.09575 -6.431 1.44e-10
## bio_fSubcell -0.12181 0.04525 -2.692 0.007141
## bio_fTissue -0.18136 0.05487 -3.305 0.000959
## log.exposure.duration.d 0.06250 0.01479 4.225 2.45e-05
## acute.chronic_fChronic 0.04233 0.02363 1.792 0.073272
##
## (Intercept) ***
## log.dose.particles.mL.master
## log.dose.mg.L.master ***
## log.dose.um3.mL.master
## log.size.length.um.used.for.conversions
## shape_fFragment ***
## shape_fSphere **
## org_fFish **
## org_fMollusca ***
## life_fEarly **
## life_fJuvenile ***
## bio_fOrganism ***
## bio_fPopulation ***
## bio_fSubcell **
## bio_fTissue ***
## log.exposure.duration.d ***
## acute.chronic_fChronic .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4375 on 3499 degrees of freedom
## Multiple R-squared: 0.1078, Adjusted R-squared: 0.1037
## F-statistic: 26.42 on 16 and 3499 DF, p-value: < 2.2e-16
Stepwise Model - Both Directions
##
## Call:
## lm(formula = effect_10 ~ log.dose.particles.mL.master + log.dose.mg.L.master +
## log.size.length.um.used.for.conversions + shape_f + org_f +
## life_f + bio_f + log.exposure.duration.d + acute.chronic_f,
## data = aoc_setup_select_1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.6638 -0.3313 -0.1912 0.4728 1.0969
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.908561 0.084751 10.720 < 2e-16
## log.dose.particles.mL.master -0.013430 0.007951 -1.689 0.09130
## log.dose.mg.L.master 0.043341 0.007911 5.478 4.60e-08
## log.size.length.um.used.for.conversions -0.060808 0.024323 -2.500 0.01246
## shape_fFragment -0.260118 0.045263 -5.747 9.87e-09
## shape_fSphere -0.193184 0.048372 -3.994 6.64e-05
## org_fFish -0.074439 0.023514 -3.166 0.00156
## org_fMollusca -0.164055 0.026476 -6.196 6.45e-10
## life_fEarly -0.058559 0.021791 -2.687 0.00724
## life_fJuvenile -0.114927 0.023539 -4.882 1.09e-06
## bio_fOrganism -0.338795 0.047723 -7.099 1.51e-12
## bio_fPopulation -0.616139 0.095743 -6.435 1.40e-10
## bio_fSubcell -0.121917 0.045250 -2.694 0.00709
## bio_fTissue -0.181486 0.054869 -3.308 0.00095
## log.exposure.duration.d 0.062327 0.014790 4.214 2.57e-05
## acute.chronic_fChronic 0.041786 0.023615 1.770 0.07690
##
## (Intercept) ***
## log.dose.particles.mL.master .
## log.dose.mg.L.master ***
## log.size.length.um.used.for.conversions *
## shape_fFragment ***
## shape_fSphere ***
## org_fFish **
## org_fMollusca ***
## life_fEarly **
## life_fJuvenile ***
## bio_fOrganism ***
## bio_fPopulation ***
## bio_fSubcell **
## bio_fTissue ***
## log.exposure.duration.d ***
## acute.chronic_fChronic .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4374 on 3500 degrees of freedom
## Multiple R-squared: 0.1076, Adjusted R-squared: 0.1038
## F-statistic: 28.15 on 15 and 3500 DF, p-value: < 2.2e-16
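The code behind the two summaries above is not shown, so the following is only a sketch of how a full model and a both-directions stepwise selection could be fitted. It assumes base R's `step()` was used (it may equally have been `MASS::stepAIC()`), and it runs on simulated data with hypothetical column names.

```r
# Simulated data; all column names here are hypothetical stand-ins.
set.seed(42)
n <- 400
sim <- data.frame(
  log_dose = rnorm(n),
  log_size = rnorm(n),
  noise    = rnorm(n),  # a predictor with no real effect
  shape_f  = factor(sample(c("Fiber", "Fragment", "Sphere"), n, replace = TRUE))
)
p <- plogis(-0.5 + 0.8 * sim$log_dose - 0.6 * (sim$shape_f == "Sphere"))
sim$effect_10 <- rbinom(n, 1, p)

# Full model: every remaining column is used as a predictor.
full_model <- lm(effect_10 ~ ., data = sim)

# Stepwise selection by AIC, allowing terms to be dropped and re-added.
step_model <- step(full_model, direction = "both", trace = 0)

summary(step_model)
```

Starting from the full model, `direction = "both"` lets a term that was dropped early come back in later if it improves AIC.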
Full Model
##
## Call:
## glm(formula = effect_10 ~ logdose.mg.L.master * size.length.um.used.for.conversions,
## family = "binomial", data = m1_crust, na.action = "na.exclude")
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8447 -0.7063 -0.6178 -0.5142 2.0485
##
## Coefficients:
## Estimate Std. Error
## (Intercept) -1.425e+00 7.388e-02
## logdose.mg.L.master 1.807e-01 4.837e-02
## size.length.um.used.for.conversions -3.807e-04 1.785e-04
## logdose.mg.L.master:size.length.um.used.for.conversions 1.057e-04 4.555e-05
## z value Pr(>|z|)
## (Intercept) -19.288 < 2e-16 ***
## logdose.mg.L.master 3.737 0.000187 ***
## size.length.um.used.for.conversions -2.133 0.032944 *
## logdose.mg.L.master:size.length.um.used.for.conversions 2.320 0.020367 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1311.5 on 1310 degrees of freedom
## Residual deviance: 1272.9 on 1307 degrees of freedom
## AIC: 1280.9
##
## Number of Fisher Scoring iterations: 4
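To read the coefficients above as probabilities, the linear predictor can be passed through the inverse logit. The sketch below copies the estimates from the summary and assumes that `logdose.mg.L.master` is the log10 dose in mg/L and that size is in µm, as the variable names suggest. The function name `predict_effect()` is hypothetical.

```r
# Coefficients copied from the summary above (mg/L dose metric).
b <- c(intercept = -1.425, logdose = 0.1807,
       size = -3.807e-04, interaction = 1.057e-04)

# Predicted probability of an effect for a given log10 dose and size (um).
predict_effect <- function(logdose, size_um) {
  eta <- b[["intercept"]] +
    b[["logdose"]] * logdose +
    b[["size"]] * size_um +
    b[["interaction"]] * logdose * size_um
  plogis(eta)  # inverse logit: 1 / (1 + exp(-eta))
}

# Same particle size, two doses two log10 units apart.
p_low  <- predict_effect(logdose = 0, size_um = 100)
p_high <- predict_effect(logdose = 2, size_um = 100)
```

Because of the interaction term, the change in probability per unit of log dose depends on particle size, so the comparison should be repeated at each size of interest.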
Full Model
##
## Call:
## glm(formula = effect_10 ~ logdose.um3.mL.master * size.length.um.used.for.conversions,
## family = "binomial", data = m1_crust, na.action = "na.exclude")
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8116 -0.6866 -0.6282 -0.5292 2.0869
##
## Coefficients:
## Estimate Std. Error
## (Intercept) -2.176e+00 2.574e-01
## logdose.um3.mL.master 1.262e-01 4.049e-02
## size.length.um.used.for.conversions -1.176e-03 4.464e-04
## logdose.um3.mL.master:size.length.um.used.for.conversions 1.259e-04 4.533e-05
## z value Pr(>|z|)
## (Intercept) -8.451 < 2e-16 ***
## logdose.um3.mL.master 3.118 0.00182 **
## size.length.um.used.for.conversions -2.635 0.00842 **
## logdose.um3.mL.master:size.length.um.used.for.conversions 2.777 0.00549 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1311.5 on 1310 degrees of freedom
## Residual deviance: 1277.6 on 1307 degrees of freedom
## AIC: 1285.6
##
## Number of Fisher Scoring iterations: 4
##
## Call:
## glm(formula = effect_10 ~ logdose.particles.mL.master * size.length.um.used.for.conversions,
## family = "binomial", data = m1_crust_part, na.action = "na.exclude")
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.5942 -0.6724 -0.6214 -0.5551 2.0530
##
## Coefficients:
## Estimate
## (Intercept) -1.808e+00
## logdose.particles.mL.master 7.178e-02
## size.length.um.used.for.conversions 2.881e-04
## logdose.particles.mL.master:size.length.um.used.for.conversions 1.331e-04
## Std. Error
## (Intercept) 1.355e-01
## logdose.particles.mL.master 2.177e-02
## size.length.um.used.for.conversions 6.364e-05
## logdose.particles.mL.master:size.length.um.used.for.conversions 4.499e-05
## z value
## (Intercept) -13.342
## logdose.particles.mL.master 3.297
## size.length.um.used.for.conversions 4.528
## logdose.particles.mL.master:size.length.um.used.for.conversions 2.959
## Pr(>|z|)
## (Intercept) < 2e-16 ***
## logdose.particles.mL.master 0.000978 ***
## size.length.um.used.for.conversions 5.96e-06 ***
## logdose.particles.mL.master:size.length.um.used.for.conversions 0.003090 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1311.5 on 1310 degrees of freedom
## Residual deviance: 1280.0 on 1307 degrees of freedom
## AIC: 1288
##
## Number of Fisher Scoring iterations: 4
plot
### volume
##
## Call:
## glm(formula = effect_10 ~ logdose.um3.mL.master * size.length.um.used.for.conversions,
## family = "binomial", data = m1_crust_volume, na.action = "na.exclude")
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8116 -0.6866 -0.6282 -0.5292 2.0869
##
## Coefficients:
## Estimate Std. Error
## (Intercept) -2.176e+00 2.574e-01
## logdose.um3.mL.master 1.262e-01 4.049e-02
## size.length.um.used.for.conversions -1.176e-03 4.464e-04
## logdose.um3.mL.master:size.length.um.used.for.conversions 1.259e-04 4.533e-05
## z value Pr(>|z|)
## (Intercept) -8.451 < 2e-16 ***
## logdose.um3.mL.master 3.118 0.00182 **
## size.length.um.used.for.conversions -2.635 0.00842 **
## logdose.um3.mL.master:size.length.um.used.for.conversions 2.777 0.00549 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1311.5 on 1310 degrees of freedom
## Residual deviance: 1277.6 on 1307 degrees of freedom
## AIC: 1285.6
##
## Number of Fisher Scoring iterations: 4
plot
Full Model
##
## Call:
## glm(formula = effect_10 ~ logdose.particles.mL.master * size.length.um.used.for.conversions,
## family = "binomial", data = m1_crust, na.action = "na.exclude")
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.5942 -0.6724 -0.6214 -0.5551 2.0530
##
## Coefficients:
## Estimate
## (Intercept) -1.808e+00
## logdose.particles.mL.master 7.178e-02
## size.length.um.used.for.conversions 2.881e-04
## logdose.particles.mL.master:size.length.um.used.for.conversions 1.331e-04
## Std. Error
## (Intercept) 1.355e-01
## logdose.particles.mL.master 2.177e-02
## size.length.um.used.for.conversions 6.364e-05
## logdose.particles.mL.master:size.length.um.used.for.conversions 4.499e-05
## z value
## (Intercept) -13.342
## logdose.particles.mL.master 3.297
## size.length.um.used.for.conversions 4.528
## logdose.particles.mL.master:size.length.um.used.for.conversions 2.959
## Pr(>|z|)
## (Intercept) < 2e-16 ***
## logdose.particles.mL.master 0.000978 ***
## size.length.um.used.for.conversions 5.96e-06 ***
## logdose.particles.mL.master:size.length.um.used.for.conversions 0.003090 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1311.5 on 1310 degrees of freedom
## Residual deviance: 1280.0 on 1307 degrees of freedom
## AIC: 1288
##
## Number of Fisher Scoring iterations: 4
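One way to read the three dose-metric summaries fitted to `m1_crust` side by side is through their AIC values. The sketch below simply copies the AICs printed above into a vector; the labels are hypothetical shorthand for the three metrics.

```r
# AIC values copied from the three glm summaries fitted to m1_crust.
aic <- c(mg_L = 1280.9, um3_mL = 1285.6, particles_mL = 1288.0)

# Difference from the lowest AIC; smaller is better.
delta_aic <- aic - min(aic)

# Metric with the lowest AIC.
best_metric <- names(which.min(aic))

print(round(delta_aic, 1))
```

With fitted model objects at hand, `AIC(model_1, model_2, model_3)` would give the same comparison directly, provided all models are fitted to the same rows.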
##
## Call:
## glm(formula = effect_10 ~ logdose.mg.L.master * size_f, family = "binomial",
## data = m1_crust, na.action = "na.exclude")
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.7085 -0.7062 -0.6167 -0.4255 2.3881
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.57691 0.21094 -7.476 7.67e-14 ***
## logdose.mg.L.master 0.40501 0.16344 2.478 0.0132 *
## size_f100nm < 1µm -0.01864 0.37454 -0.050 0.9603
## size_f1µm < 100µm 0.17395 0.22858 0.761 0.4467
## size_f100µm < 1mm -0.25876 0.35626 -0.726 0.4676
## size_f1mm < 5mm -0.87511 0.64223 -1.363 0.1730
## logdose.mg.L.master:size_f100nm < 1µm 0.45050 0.31359 1.437 0.1508
## logdose.mg.L.master:size_f1µm < 100µm -0.24482 0.17319 -1.414 0.1575
## logdose.mg.L.master:size_f100µm < 1mm -0.23277 0.22994 -1.012 0.3114
## logdose.mg.L.master:size_f1mm < 5mm 0.04129 0.21904 0.189 0.8505
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1358.5 on 1347 degrees of freedom
## Residual deviance: 1303.8 on 1338 degrees of freedom
## AIC: 1323.8
##
## Number of Fisher Scoring iterations: 5
Plotted below.

#### Acute only crustacea
##
## Call:
## glm(formula = effect_10 ~ logdose.mg.L.master * size_f + logdose.particles.mL.master *
## size_f, family = "binomial", data = m1_crust_acute, na.action = "na.exclude")
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.4072 -0.8096 -0.6639 -0.2631 2.1764
##
## Coefficients:
## Estimate Std. Error z value
## (Intercept) -30.504 9.991 -3.053
## logdose.mg.L.master -2.914 1.075 -2.711
## size_f100nm < 1µm 29.153 10.270 2.839
## size_f1µm < 100µm 28.853 9.996 2.886
## size_f100µm < 1mm 28.939 10.003 2.893
## logdose.particles.mL.master 2.912 1.001 2.908
## logdose.mg.L.master:size_f100nm < 1µm 3.637 1.165 3.123
## logdose.mg.L.master:size_f1µm < 100µm 3.002 1.078 2.785
## logdose.mg.L.master:size_f100µm < 1mm 2.298 1.148 2.002
## size_f100nm < 1µm:logdose.particles.mL.master -2.905 1.047 -2.773
## size_f1µm < 100µm:logdose.particles.mL.master -2.767 1.004 -2.756
## size_f100µm < 1mm:logdose.particles.mL.master -1.988 1.153 -1.725
## Pr(>|z|)
## (Intercept) 0.00227 **
## logdose.mg.L.master 0.00672 **
## size_f100nm < 1µm 0.00453 **
## size_f1µm < 100µm 0.00390 **
## size_f100µm < 1mm 0.00382 **
## logdose.particles.mL.master 0.00364 **
## logdose.mg.L.master:size_f100nm < 1µm 0.00179 **
## logdose.mg.L.master:size_f1µm < 100µm 0.00535 **
## logdose.mg.L.master:size_f100µm < 1mm 0.04526 *
## size_f100nm < 1µm:logdose.particles.mL.master 0.00555 **
## size_f1µm < 100µm:logdose.particles.mL.master 0.00585 **
## size_f100µm < 1mm:logdose.particles.mL.master 0.08459 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 945.92 on 853 degrees of freedom
## Residual deviance: 904.63 on 842 degrees of freedom
## AIC: 928.63
##
## Number of Fisher Scoring iterations: 5
The above glm is visualized below.
This approach produces a result similar to the ggPredict() function, except that it relies on ggplot(), which is more malleable and transparent. The general steps are: first create a new data frame over 1000 values of size using expand.grid(), then call predict() on it, and finally plot the predictions with ggplot() and geom_line(), setting colour=size.
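The steps can be sketched as follows on simulated data. The exact layout of the prediction grid is not shown in this report, so this sketch assumes 1000 evenly spaced dose values crossed with each size category. The size labels and column names are hypothetical, and the ggplot2 package is assumed to be installed.

```r
library(ggplot2)

# Simulated data; size categories and column names are hypothetical.
set.seed(7)
n <- 600
sim <- data.frame(
  logdose = runif(n, -3, 3),
  size_f  = factor(sample(c("small", "medium", "large"), n, replace = TRUE),
                   levels = c("small", "medium", "large"))
)
eta <- -1.5 + 0.4 * sim$logdose + 0.3 * (sim$size_f == "large") * sim$logdose
sim$effect_10 <- rbinom(n, 1, plogis(eta))

# Logistic regression with a dose x size interaction.
fit <- glm(effect_10 ~ logdose * size_f, family = "binomial", data = sim)

# Prediction grid: 1000 dose values for every size category.
new_data <- expand.grid(
  logdose = seq(min(sim$logdose), max(sim$logdose), length.out = 1000),
  size_f  = levels(sim$size_f)
)

# type = "response" returns probabilities rather than log-odds.
new_data$prob <- predict(fit, newdata = new_data, type = "response")

# One fitted curve per size category.
p <- ggplot(new_data, aes(x = logdose, y = prob, colour = size_f)) +
  geom_line() +
  labs(x = "log10 dose", y = "Predicted probability of effect", colour = "Size")
print(p)
```

Building the grid by hand keeps every plotted value traceable to a `predict()` call, which is what makes this route more transparent than a wrapper function.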
### Probability of Survival by Time